A PDF is not enough: Crowdsourcing the T1 mapping common ground via the ISMRM reproducibility challenge

*Mathieu Boudreau1,2, *Agah Karakuzu1, Julien Cohen-Adad1,3,4,5, Ecem Bozkurt6, Madeline Carr7,8, Marco Castellaro9, Luis Concha10, Mariya Doneva11, Seraina Dual12, Alex Ensworth13,14, Alexandru Foias1, Véronique Fortier15,16, Refaat E. Gabr17, Guillaume Gilbert18, Carri K. Glide-Hurst19, Matthew Grech-Sollars20,21, Siyuan Hu22, Oscar Jalnefjord23,24, Jorge Jovicich25, Kübra Keskin6, Peter Koken11, Anastasia Kolokotronis13,26, Simran Kukran27,28, Nam G. Lee6, Ives R. Levesque13,29, Bochao Li6, Dan Ma22, Burkhard Mädler30, Nyasha Maforo31,32, Jamie Near33,34, Erick Pasaye10, Alonso Ramirez-Manzanares35, Ben Statton36, Christian Stehning30, Stefano Tambalo25, Ye Tian6, Chenyang Wang37, Kilian Weiss30, Niloufar Zakariaei38, Shuo Zhang30, Ziwei Zhao6, Nikola Stikov1,2,39

  • *Authors MB and AK contributed equally to this work

1NeuroPoly Lab, Polytechnique Montréal, Montreal, Quebec, Canada, 2Montreal Heart Institute, Montreal, Quebec, Canada, 3Unité de Neuroimagerie Fonctionnelle (UNF), Centre de recherche de l’Institut Universitaire de Gériatrie de Montréal (CRIUGM), Montreal, Quebec, Canada, 4Mila - Quebec AI Institute, Montreal, QC, Canada, 5Centre de recherche du CHU Sainte-Justine, Université de Montréal, Montreal, QC, Canada, 6Magnetic Resonance Engineering Laboratory (MREL), University of Southern California, Los Angeles, California, USA, 7Medical Physics, Ingham Institute for Applied Medical Research, Liverpool, Australia, 8Department of Medical Physics, Liverpool and Macarthur Cancer Therapy Centres, Liverpool, Australia, 9Department of Information Engineering, University of Padova, Padova, Italy, 10Institute of Neurobiology, Universidad Nacional Autónoma de México Campus Juriquilla, Querétaro, México, 11Philips Research Hamburg, Germany, 12Stanford University, Stanford, California, United States, 13Medical Physics Unit, McGill University, Montreal, Canada, 14University of British Columbia, Vancouver, Canada, 15Department of Medical Imaging, McGill University Health Centre, Montreal, Quebec, Canada, 16Department of Radiology, McGill University, Montreal, Quebec, Canada, 17Department of Diagnostic and Interventional Imaging, University of Texas Health Science Center at Houston, McGovern Medical School, Houston, Texas, USA, 18MR Clinical Science, Philips Canada, Mississauga, Ontario, Canada, 19Department of Human Oncology, University of Wisconsin-Madison, Madison, Wisconsin, USA, 20Centre for Medical Image Computing, Department of Computer Science, University College London, London, UK, 21Lysholm Department of Neuroradiology, National Hospital for Neurology and Neurosurgery, University College London Hospitals NHS Foundation Trust, London, UK, 22Department of Biomedical Engineering, Case Western Reserve University, Cleveland, Ohio, USA, 23Department of Medical Radiation Sciences, Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden, 24Biomedical Engineering, Sahlgrenska University Hospital, Gothenburg, Sweden, 25Center for Mind/Brain Sciences, University of Trento, Italy, 26Hopital Maisonneuve-Rosemont, Montreal, Canada, 27Bioengineering, Imperial College London, UK, 28Radiotherapy and Imaging, Institute of Cancer Research, Imperial College London, UK, 29Research Institute of the McGill University Health Centre, Montreal, Canada, 30Clinical Science, Philips Healthcare, Germany, 31Department of Radiological Sciences, University of California Los Angeles, Los Angeles, CA, USA, 32Physics and Biology in Medicine IDP, University of California Los Angeles, Los Angeles, CA, USA, 33Douglas Brain Imaging Centre, Montreal, Canada, 34Sunnybrook Research Institute, Toronto, Canada, 35Computer Science Department, Centro de Investigación en Matemáticas, A.C., Guanajuato, México, 36Medical Research Council, London Institute of Medical Sciences, Imperial College London, London, United Kingdom, 37Department of Radiation Oncology - CNS Service, The University of Texas MD Anderson Cancer Center, Texas, USA, 38Department of Biomedical Engineering, University of British Columbia, British Columbia, Canada, 39Center for Advanced Interdisciplinary Research, Ss. Cyril and Methodius University, Skopje, North Macedonia

Abstract

Purpose: T1 mapping is a widely used quantitative MRI technique, but its tissue-specific values remain inconsistent across protocols, sites, and vendors. The ISMRM Reproducible Research study group (RRSG) and Quantitative MR study group (qMRSG) jointly launched a T1 mapping reproducibility challenge to assess the reproducibility of a well-established inversion recovery T1 mapping technique, published solely as a PDF, on a standardized phantom and in healthy human brains.

Methods: The challenge used the acquisition protocol and fitting algorithm from Barral et al. 2010. Participants collected T1 mapping data on the ISMRM/NIST phantom and/or in healthy human brains. Data submission, pipeline development, and analysis were conducted using open-source platforms. Inter-submission and intra-submission comparisons were performed using one dataset per submission.

Results: Eighteen submissions were accepted, with data collected on scanners from three MRI manufacturers, primarily at 3T (with one submission at 0.35T). The study collected 39 phantom and 56 human datasets. The mean coefficient of variation (CoV) was 6.1% for inter-submission phantom measurements, which was nearly twice as high as the intra-submission CoV (2.9%). For human data, the inter-/intra-submission CoV was 5.9%/3.2% in the genu of the corpus callosum and 16%/6.9% in the cortical gray matter. To facilitate broader community access and engagement, an interactive dashboard was developed and is available at https://rrsg2020.dashboards.neurolibre.org.

Conclusion: The inter-submission variability was approximately twice as high as the intra-submission variability in both phantom and human brain T1 measurements, indicating that the published PDF was not sufficient to reproduce a quantitative MRI protocol.


1 | INTRODUCTION

Significant challenges exist in the reproducibility of quantitative MRI (qMRI) [1]. Despite its promise of improving the specificity and reproducibility of MRI acquisitions, few qMRI techniques have been integrated into clinical practice. Even the most fundamental MR parameters cannot be measured with sufficient reproducibility and precision across clinical scanners to pass the second of six stages of technical assessment for clinical biomarkers [2–4]. Half a century has passed since quantitative T1 (spin-lattice relaxation time) measurements were first reported as a potential biomarker for tumors [5], followed shortly thereafter by the first in vivo T1 maps [6] of tumors, but there is still disagreement in reported values for this fundamental parameter across different sites, vendors, and implementations [7].

Among fundamental MRI parameters, T1 holds significant importance [8]. T1 represents the time constant for recovery of equilibrium longitudinal magnetization. T1 values will vary depending on the molecular mobility and magnetic field strength [9–11]. Knowledge of the T1 values for tissue is crucial for optimizing clinical MRI sequences for contrast and time efficiency [12–14] and to calibrate other quantitative MRI techniques [15,16]. Inversion recovery (IR) [17,18] is considered the gold standard for T1 measurement due to its robustness, but its long acquisition times limit the clinical use of IR for T1 mapping [7]. In practice, IR is often used as a reference for validating other T1 mapping techniques, such as variable flip angle imaging (VFA) [19–21], Look-Locker [22–24], and MP2RAGE [25,26].

In ongoing efforts to standardize T1 mapping methods, researchers have been actively developing quantitative MRI phantoms [27]. The International Society for Magnetic Resonance in Medicine (ISMRM) and the National Institute of Standards and Technology (NIST) collaborated on a standard system phantom [28], which was subsequently commercialized (Premium System Phantom, CaliberMRI, Boulder, Colorado). This phantom has since been used in large multicenter studies, such as Bane et al. [29], which concluded that acquisition protocol and field strength influence accuracy, repeatability, and interplatform reproducibility. Another NIST-led study [30] found no significant T1 discrepancies among measurements using NIST protocols across 27 MRI systems from three vendors at two clinical field strengths.

The 2020 ISMRM reproducibility challenge posed the following question: can an imaging protocol, independently implemented at multiple centers, consistently measure one of the fundamental MRI parameters (T1)? To assess this, we proposed using inversion recovery on a standardized phantom (ISMRM/NIST system phantom) and the healthy human brain. Specifically, this challenge explored whether the information provided in a published PDF of a seminal paper on T1 mapping [31] is sufficient to ensure the reproducibility across independent research imaging groups.

2 | METHODS

2.1 | Phantom and human data

The challenge asked participants with access to the ISMRM/NIST system phantom [28] (Premium System Phantom, CaliberMRI, Boulder, Colorado) to measure T1 maps of the phantom’s T1 plate (Table 1). Researchers who participated in the challenge were instructed to record the temperature before and after scanning the phantom using the phantom’s internal thermometer. Instructions for positioning and setting up the phantom were devised by NIST and were provided to participants through the NIST website. In brief, the instructions explained how to orient the phantom and how long the phantom should be in the scanner room prior to scanning to achieve thermal equilibrium.

Table 1 Reference T1 values of the NiCl2 array of the standard system phantom (for both phantom versions) measured at 20 °C and 3T. Phantoms with serial numbers 0042 or less are referred to as “Version 1”, and those 0043 or greater are “Version 2”.

Sphere    Version 1 (ms)      Version 2 (ms)
1         1989 ± 1.0          1883.97 ± 30.32
2         1454 ± 2.5          1330.16 ± 20.41
3         984.1 ± 0.33        987.27 ± 14.22
4         706 ± 1.0           690.08 ± 10.12
5         496.7 ± 0.41        484.97 ± 7.06
6         351.5 ± 0.91        341.58 ± 4.97
7         247.13 ± 0.086      240.86 ± 3.51
8         175.3 ± 0.11        174.95 ± 2.48
9         125.9 ± 0.33        121.08 ± 1.75
10        89.0 ± 0.17         85.75 ± 1.24
11        62.7 ± 0.13         60.21 ± 0.87
12        44.53 ± 0.090       42.89 ± 0.44
13        30.84 ± 0.016       30.40 ± 0.62
14        21.719 ± 0.005      21.44 ± 0.31

Participants were also instructed to collect T1 maps in healthy human brains, and were asked to measure a single slice positioned parallel to the anterior commissure - posterior commissure (AC-PC) line. Prior to imaging, the participants consented to share their de-identified data with the challenge organizers and on the Open Science Framework (OSF.io) website. As the submitted data was a single slice, the researchers were not instructed to de-face the data of their participants. Researchers submitting human data provided written confirmation to the organizers that their data was acquired in accordance with their institutional ethics committee (or equivalent regulatory body) and that the subjects had consented to data sharing as outlined in the challenge.

2.2 | MRI acquisition protocol

Participants followed the inversion recovery T1 mapping protocol optimized for the human brain as described in the published PDF [31], which consisted of: TR = 2550 ms, TIs = 50, 400, 1100, 2500 ms, TE = 14 ms, 2 mm slice thickness and 1×1 mm2 in-plane resolution. Note that this protocol is not suitable for fitting models that assume TR > 5T1. Instead, the more general Barral et al. [31] fitting model described in Section 2.4.1 can be used, and this model is compatible with both magnitude-only and complex data. Researchers were instructed to closely adhere to this protocol and report any deviations due to technical limitations.
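As a quick numerical check of why a long-TR model does not apply to this protocol, the sketch below compares the protocol TR with 5 × T1 for the mean brain T1 values reported in the Results. The function and variable names are illustrative assumptions and are not part of the challenge code.

```python
# Protocol repetition time quoted in the text (ms).
TR_MS = 2550

# Mean T1 values (ms) reported in the Results of this challenge.
MEAN_T1_MS = {"genu": 828, "splenium": 852, "cortical GM": 1548, "deep GM": 1188}


def long_tr_assumption_holds(tr_ms, t1_ms):
    """Return True if TR > 5*T1, i.e. a long-TR model would be reasonable."""
    return tr_ms > 5 * t1_ms


checks = {roi: long_tr_assumption_holds(TR_MS, t1) for roi, t1 in MEAN_T1_MS.items()}
for roi, ok in checks.items():
    print(f"{roi}: 5*T1 = {5 * MEAN_T1_MS[roi]} ms, TR > 5*T1? {ok}")
```

For every brain ROI the condition fails, which is consistent with the use of the general signal model rather than a long-TR approximation.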

2.3 | Data Submissions

Data submissions for the challenge were handled through a GitHub repository (https://github.com/rrsg2020/data_submission), enabling a standardized and transparent process. All datasets were converted to the NIfTI format, and images for all TIs were concatenated along the fourth dimension. Each submission included a YAML file to store additional information (submitter details, acquisition details, and phantom or human subject details). Submissions were reviewed, and following acceptance the datasets were uploaded to OSF.io (osf.io/ywc9g/). A Jupyter Notebook [32,33] pipeline using qMRLab [34,35] was used to process the T1 maps and to conduct quality-control checks. MyBinder links to Jupyter notebooks that reproduced each T1 map were shared in each respective submission GitHub issue to easily reproduce the results in web browsers while maintaining consistent computational environments. Eighteen submissions were included in the analysis, which resulted in 39 T1 maps of the ISMRM/NIST system phantom and 56 brain T1 maps. Figure 1 illustrates all the submissions that acquired phantom data (Figure 1-a) and human data (Figure 1-b), the MRI scanner vendors, and the resulting T1 mapping datasets. Some submissions included measurements where both complex and magnitude-only data from the same acquisition were used to fit T1 maps, so the total number of unique acquisitions is lower than the numbers reported above (27 for phantom data and 44 for human data). The datasets were collected on systems from three MRI manufacturers (Siemens, GE, Philips) and were acquired at 3T, except for one dataset acquired at 0.35T (the ViewRay MRidian MR-linac).

_images/figure_1.png

2.4 | Fitting Model and Pipeline

A reduced-dimension non-linear least squares (RD-NLS) approach was used to fit the complex general inversion recovery signal equation:

\[S(TI) = a + be^{-TI/T_1} \tag{1}\]

where a and b are complex constants. This approach, developed by Barral et al. [31], offers a model for the general T1 signal equation without relying on the long-TR approximation. The constants a and b implicitly incorporate TR, as well as other imaging parameters such as the excitation and inversion pulse flip angles and TE. Barral et al. [31] shared their MATLAB (MathWorks, Natick, MA) code for the fitting algorithm used in their paper. Magnitude-only data were fitted to a modified version of Eq. 1 (Eq. 15 of Barral et al. 2010) with signal-polarity restoration by finding the signal minimum, fitting the inversion recovery curve for two cases (data points for TI < TIminimum flipped, and data points for TI ≤ TIminimum flipped), and selecting the case that resulted in the best fit. This code is available as part of the open-source software qMRLab [34,35], which provides a standardized application programming interface (API) to call the fitting in MATLAB/Octave scripts.
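The idea behind the reduced-dimension fit and the polarity restoration can be illustrated with a minimal Python sketch. It is not the Barral et al. or qMRLab implementation: it assumes real-valued (not complex) data, replaces the optimizer with a plain grid search over T1, and uses synthetic noiseless signals; all function names and numerical values are hypothetical. For a fixed T1, Eq. 1 is linear in a and b, so only T1 needs to be searched.

```python
import math


def fit_linear_terms(tis, signal, t1):
    """For a fixed T1, solve the linear least-squares problem for a and b."""
    e = [math.exp(-ti / t1) for ti in tis]
    n = len(tis)
    se, see = sum(e), sum(x * x for x in e)
    ss, ses = sum(signal), sum(x * s for x, s in zip(e, signal))
    det = n * see - se * se
    b = (n * ses - se * ss) / det
    a = (ss - b * se) / n
    rss = sum((s - (a + b * x)) ** 2 for s, x in zip(signal, e))
    return a, b, rss


def fit_t1(tis, signal, t1_grid):
    """Search over T1 only; a and b follow from the linear solve."""
    best = None
    for t1 in t1_grid:
        a, b, rss = fit_linear_terms(tis, signal, t1)
        if best is None or rss < best[3]:
            best = (t1, a, b, rss)
    return best  # (T1, a, b, residual sum of squares)


def fit_t1_magnitude(tis, magnitude, t1_grid):
    """Polarity restoration: flip points before the minimum, without and with it."""
    i_min = min(range(len(magnitude)), key=magnitude.__getitem__)
    candidates = []
    for n_flip in (i_min, i_min + 1):  # TI < TI_min flipped, then TI <= TI_min
        restored = [-m if i < n_flip else m for i, m in enumerate(magnitude)]
        candidates.append(fit_t1(tis, restored, t1_grid))
    return min(candidates, key=lambda fit: fit[3])


# Synthetic example: protocol TIs (ms) and illustrative, made-up parameters.
TIS_MS = [50, 400, 1100, 2500]
TRUE_T1, TRUE_A, TRUE_B = 850.0, 1.0, -1.9
signal = [TRUE_A + TRUE_B * math.exp(-ti / TRUE_T1) for ti in TIS_MS]
T1_GRID = range(100, 3001, 5)

t1_signed = fit_t1(TIS_MS, signal, T1_GRID)[0]
t1_mag = fit_t1_magnitude(TIS_MS, [abs(s) for s in signal], T1_GRID)[0]
print(t1_signed, t1_mag)
```

In this noiseless example both the signed and the magnitude fits recover the simulated T1, because the correctly restored polarity yields the smaller residual of the two candidate cases.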

A data processing pipeline was written using MATLAB/Octave in a Jupyter Notebook. This pipeline downloads every dataset from osf.io (osf.io/ywc9g/), loads their configuration file, fits the T1 maps, and then saves them to NIfTI and PNG formats. The code is available on GitHub (https://github.com/rrsg2020/t1_fitting_pipeline, filename: RRSG_T1_fitting.ipynb). Finally, T1 maps were manually uploaded to OSF (https://osf.io/ywc9g/).

2.5 | Image Labeling & Registration

The T1 plate (NiCl2 array) of the phantom has 14 spheres that were labeled as the regions-of-interest (ROI) using a numerical mask template created in MATLAB, provided by NIST researchers (Figure 1-c). To avoid potential edge effects in the T1 maps, the ROI labels were reduced to 60% of the expected sphere diameter. A registration pipeline in Python using the Advanced Normalization Tools (ANTs) [36] was developed and shared in the analysis repository of our GitHub organization (https://github.com/rrsg2020/analysis, filename: register_t1maps_nist.py, commit ID: 8d38644). Briefly, a label-based registration was first applied to obtain a coarse alignment, followed by an affine registration (gradientStep: 0.1, metric: cross correlation, number of steps: 3, iterations: 100/100/100, smoothness: 0/0/0, sub-sampling: 4/2/1) and a BSplineSyN registration (gradientStep: 0.5, meshSizeAtBaseLevel: 3, number of steps: 3, iterations: 50/50/10, smoothness: 0/0/0, sub-sampling: 4/2/1). The ROI labels template was nonlinearly registered to each T1 map uploaded to OSF.
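The effect of shrinking an ROI to 60% of the sphere diameter can be sketched as follows. This is only an illustration of the masking step, not the NIST template code or the registration pipeline; the sphere diameter, pixel size, image size, and function name are hypothetical values chosen for the example.

```python
def circular_roi(shape, center, diameter_mm, pixel_mm, scale=0.6):
    """Boolean mask of a disc whose diameter is `scale` times the sphere diameter."""
    radius_px = 0.5 * scale * diameter_mm / pixel_mm
    rows, cols = shape
    cy, cx = center
    return [
        [(r - cy) ** 2 + (c - cx) ** 2 <= radius_px ** 2 for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical values for illustration only: a 15 mm sphere at 1 mm resolution.
full = circular_roi((21, 21), (10, 10), diameter_mm=15.0, pixel_mm=1.0, scale=1.0)
reduced = circular_roi((21, 21), (10, 10), diameter_mm=15.0, pixel_mm=1.0)

n_full = sum(map(sum, full))
n_reduced = sum(map(sum, reduced))
print(f"full ROI: {n_full} pixels, reduced ROI: {n_reduced} pixels")
```

The reduced mask keeps only the central pixels of the sphere, which is how it avoids voxels near the sphere boundary.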

For human data, ROIs were manually segmented by a single researcher (M.B., 11+ years of neuroimaging experience) using FSLeyes [37] in four regions (Figure 1-d): the genu, splenium, deep gray matter, and cortical gray matter. Automatic segmentation was not used because the data were single-slice and slice positioning was inconsistent between datasets.

2.6 | Analysis and Statistics

Analysis code and scripts were developed and shared in a version-controlled public GitHub repository (https://github.com/rrsg2020/analysis). The T1 fitting and data analysis were performed by M.B., one of the challenge organizers. Computational environment requirements were containerized in Docker [38,39] to create an executable environment that allows for analysis reproduction in a web browser via MyBinder [40]. Backend Python files handled reference data, database operations, ROI masking, and general analysis tools. Configuration files handled dataset information, and the datasets were downloaded and pooled using a script (make_pooled_datasets.py). The databases were created using a reproducible Jupyter Notebook script and subsequently saved in the repository.

The mean T1 values of the ISMRM/NIST phantom data for each ROI were compared with temperature-corrected reference values and visualized in three different types of plots (linear axes, log-log axes, and error relative to the reference value). Temperature correction involved nonlinear interpolation of a NIST reference table of T1 values for temperatures ranging from 16 °C to 26 °C (2 °C intervals) as specified in the phantom’s technical specifications. For the human datasets, the means and standard deviations for each tissue ROI were calculated from all submissions across all sites. All quality assurance and analysis plot images, as well as the database files of ROI values and acquisition details for all submissions, were stored in the repository.
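The third plot type, error relative to the reference value, amounts to the calculation sketched below. The reference values are the Version 1 entries of Table 1 at 20 °C, so no temperature correction is applied here; the measured values are invented for illustration, and the 10% threshold mirrors the dashed lines of Figure 4-c. The names are assumptions, not those of the analysis repository.

```python
# Reference T1 values (ms), spheres 1-5, Version 1 phantom at 20 °C and 3T (Table 1).
REFERENCE_T1_MS = {1: 1989.0, 2: 1454.0, 3: 984.1, 4: 706.0, 5: 496.7}

# Hypothetical measured ROI means (ms), invented for illustration only.
measured_t1_ms = {1: 2010.0, 2: 1440.0, 3: 990.0, 4: 790.0, 5: 500.0}


def percent_error(measured, reference):
    """Signed error of a measurement relative to its reference value, in percent."""
    return 100.0 * (measured - reference) / reference


errors = {s: percent_error(measured_t1_ms[s], ref) for s, ref in REFERENCE_T1_MS.items()}
outliers = [s for s, err in errors.items() if abs(err) > 10.0]

for sphere, err in errors.items():
    print(f"sphere {sphere}: {err:+.1f} %")
print("spheres exceeding ±10 %:", outliers)
```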

2.7 | Dashboard

To widely disseminate the challenge results, a web-based dashboard was developed (Figure 2, https://rrsg2020.dashboards.neurolibre.org). The landing page (Figure 2-a) showcases the relationship between the phantom and brain datasets acquired at different sites/vendors. Selecting the Phantom or In Vivo icons and then clicking an ROI will display whisker plots for that region. Additional sections of the dashboard allow for displaying statistics summaries for both sets of data, a magnitude vs complex data fitting comparison, and hierarchical shift function analyses.

Figure 2 Dashboard. a) Welcome page listing all the sites, the types of subject, and scanner, and the relationship between the three. b) The phantom tab for a selected ROI, and c) The in vivo tab for a selected ROI. Link: https://rrsg2020.dashboards.neurolibre.org

3 | RESULTS

Figure 3 presents an overview of the challenge results as violin plots, depicting inter- and intra-submission comparisons for both the phantom (a) and human (b) datasets. Inter-submission coefficients of variation (CoV) were computed by selecting a single T1 map from each submission and calculating the CoV across submissions. For the phantom (Figure 3-a), the mean inter-submission CoV for the first five spheres, which span the expected T1 range in the human brain (approximately 500 to 2000 ms), was 6.1%. By addressing outliers from two sites associated with specific challenges for sphere 4 (signal null near a TI), the mean inter-submission CoV was reduced to 4.1%. One participant (submission 6, Figure 1) measured T1 maps using a consistent protocol at 7 different sites, and the mean intra-submission CoV across the first five spheres for this submission was 2.9%.
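A minimal sketch of the CoV calculation is given below. It assumes the sample standard deviation (the text does not state which estimator was used), and the T1 values are hypothetical placeholders rather than challenge data; each list holds one ROI mean per submission.

```python
import statistics


def coefficient_of_variation(values):
    """CoV in percent: sample standard deviation divided by the mean (assumed)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical ROI means (ms), one T1 map per submission, for two spheres.
t1_by_sphere = {
    1: [1950.0, 2010.0, 1890.0, 2100.0],
    2: [1420.0, 1460.0, 1390.0, 1510.0],
}

cov_per_sphere = {s: coefficient_of_variation(v) for s, v in t1_by_sphere.items()}
mean_cov = statistics.mean(list(cov_per_sphere.values()))

for sphere, cov in cov_per_sphere.items():
    print(f"sphere {sphere}: CoV = {cov:.1f} %")
print(f"mean CoV across spheres = {mean_cov:.1f} %")
```

The same function applies to the intra-submission case, where the list would instead hold the ROI means of all T1 maps from a single submission.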

For the human datasets, inter-submission CoVs for independently-implemented imaging protocols were 5.9% for genu, 10.6% for splenium, 16% for cortical GM, and 22% for deep GM. One participant (submission 18, Figure 1) measured a large dataset (13 individuals) on three scanners and two vendors, and the intra-submission CoVs for this submission were 3.2% for genu, 3.1% for splenium, 6.9% for cortical GM, and 7.1% for deep GM.
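To make the comparison explicit, the ratio of inter- to intra-submission CoV can be computed directly from the values reported above. The snippet below is simple arithmetic on those reported numbers; the variable names are illustrative.

```python
# Inter- and intra-submission CoVs (%) as reported in the text above.
INTER = {"phantom (spheres 1-5)": 6.1, "genu": 5.9, "splenium": 10.6,
         "cortical GM": 16.0, "deep GM": 22.0}
INTRA = {"phantom (spheres 1-5)": 2.9, "genu": 3.2, "splenium": 3.1,
         "cortical GM": 6.9, "deep GM": 7.1}

ratios = {roi: INTER[roi] / INTRA[roi] for roi in INTER}

for roi, ratio in ratios.items():
    print(f"{roi}: inter/intra = {ratio:.2f}")
```

The ratios range from about 1.8 (genu) to about 3.4 (splenium), with the phantom at about 2.1.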

_images/figure_3.png

Figure 3 Summary of results of the challenge as violin plots displaying the inter- and intra-submission dataset comparisons for phantoms (a) and human brains (b). Interactive figure available at: https://preprint.neurolibre.org/10.55458/neurolibre.00014/.

A scatterplot of the T1 data for all submissions and their ROIs is shown in Figure 4 (phantom a-c, and human brains d-f). The NIST T1 data is shown in Figure 4 a-c, and the same ROI T1 values are presented in each plot for different axes types (linear, log, and error) to better visualize the results. Figure 4-a shows good agreement for this dataset in comparison with the temperature-corrected reference T1 values. However, this trend did not persist for low T1 values (T1 < 100-200 ms), as seen in the log-log plot (Figure 4-b), which was expected because the imaging protocol is optimized for human water-based T1 values (T1 > 500 ms). Higher variability is seen at long T1 values (T1 ~ 2000 ms) in Figure 4-a. Errors exceeding 10% are observed in the phantom spheres with T1 values below 300 ms (Figure 4-c), and 3-4 measurements with outlier values exceeding 10% error were observed in the human water-based tissue range (~500-2000 ms).

Figure 4 d-f displays the scatter plot data for human datasets submitted to this challenge, showing mean and standard deviation T1 values from the WM (genu and splenium) and GM (cerebral cortex and deep GM) ROIs. Mean WM T1 values across all submissions were 828 ± 38 ms in the genu and 852 ± 49 ms in the splenium, and mean GM T1 values were 1548 ± 156 ms in the cortex and 1188 ± 133 ms in the deep GM, with less variation overall in WM compared to GM, possibly due to better ROI placement and less partial voluming in WM. The lower standard deviations for the ROIs of human database ID site 9 (submission 18, Figure 1) are due to good slice positioning, cutting through the AC-PC line and the genu for proper ROI placement, particularly for the corpus callosum and deep GM.

Figure 4 Measured mean T1 values vs. temperature-corrected NIST reference values of the phantom spheres are presented as linear plots (a), log-log plots (b), and plots of the error relative to reference T1 value (c). The dashed lines in plots (c) represent a ±10 % error. Mean T1 values in two sets of ROIs, white matter (one 5⨯5 voxel ROI, genu) and gray matter (three 3⨯3 voxel ROIs, cortex). Top figure shows all datasets collapsed into sites, whereas the bottom shows each individual dataset. In subplot g), the missing datapoint for deep GM in 10.001 was due to the slice positioning of the acquisition not containing deep GM. Interactive figure available at: https://preprint.neurolibre.org/10.55458/neurolibre.00014/.

4 | DISCUSSION

The challenge focused on exploring whether different research groups could reproduce T1 maps based on the protocol information reported in a published PDF [31]. Eighteen submissions independently implemented the inversion recovery T1 mapping acquisition protocol as outlined in Barral et al. [31], and reported T1 mapping data in a standard quantitative MRI phantom and/or human brains at 27 MRI sites, using systems from three different vendors (GE, Philips, Siemens). The collaborative effort produced an open-source database of 95 T1 mapping datasets, including 39 ISMRM/NIST phantom and 56 human brain datasets. The inter-submission variability was approximately twice as high as the intra-submission variability in both phantom and human brain T1 measurements, demonstrating that a PDF is not enough for reproducibility in quantitative MRI.

More information is needed to unify all the aspects of a pulse sequence across sites. However, in a vendor-native setting, this is a major challenge given the disparities between proprietary development libraries [41]. Vendor-neutral pulse sequence design platforms [42–44] have emerged as a powerful solution to standardize sequence components at the implementation level. Vendor neutrality has been shown to significantly reduce the variability of T1 maps acquired using VFA across vendors [44]. In the absence of a vendor-neutral framework, a vendor-native alternative is the implementation of a strategy to control the saturation of MT across TRs [45]. Nevertheless, this approach can still benefit from a vendor-neutral approach to enhance accessibility and unify implementations. This is because vendor-specific constraints are recognized to impose limitations on the adaptability of sequences, resulting in significant variability even when implementations are closely aligned within their respective vendor-native development environments [46].

The 2020 Reproducibility Challenge, jointly organized by the Reproducible Research and Quantitative MR ISMRM study groups, led to the creation of a large open database of standard quantitative MR phantom and human brain inversion recovery T1 maps. These maps were measured using independently implemented imaging protocols on MRI scanners from three different manufacturers. All collected data, processing pipeline code, computational environment files, and analysis scripts were shared with the goal of promoting reproducible research practices, and an interactive dashboard was developed to broaden the accessibility and engagement of the resulting datasets (https://rrsg2020.dashboards.neurolibre.org). The differences in stability between independently implemented (inter-submission) and centrally shared (intra-submission) protocols observed both in phantoms and in vivo could help inform future meta-analyses of quantitative MRI metrics [47,48] and better guide multi-center collaborations.

ACKNOWLEDGEMENT

The conception of this collaborative reproducibility challenge originated from discussions with experts, including Paul Tofts, Joëlle Barral, and Ilana Leppert, who provided valuable insights. Additionally, Kathryn Keenan, Zydrunas Gimbutas, and Andrew Dienstfrey from NIST provided their code to generate the ROI template for the ISMRM/NIST phantom. Dylan Roskams-Edris and Gabriel Pelletier from the Tanenbaum Open Science Institute (TOSI) offered guidance related to data ethics and data sharing in the context of this international multi-center conference challenge. The 2020 RRSG study group committee members who launched the challenge, Martin Uecker, Florian Knoll, Nikola Stikov, Maria Eugenia Caligiuri, and Daniel Gallichan, as well as the 2020 qMRSG committee members, Kathryn Keenan, Diego Hernando, Xavier Golay, Annie Yuxin Zhang, and Jeff Gunter, also played an essential role in making this challenge possible. We would also like to extend our thanks to all the volunteers and individuals who helped with the scanning at each imaging site. The authors thank the ISMRM Reproducible Research Study Group for reviewing the code (Version 1) supplied in the Data Availability Statement. The scope of the code review covered only the code’s ease of download, quality of documentation, and ability to run, but did not consider scientific accuracy or code efficiency.

Lastly, we acknowledge use of ChatGPT (v3), a generative language model, for accelerating manuscript preparation. The co-first authors employed ChatGPT in the initial draft for transforming bullet point sentences into paragraphs, proofreading for typos, and refining the academic tone. ChatGPT served exclusively as a writing aid, and was not used to create or interpret results.

DATA AVAILABILITY STATEMENT

An interactive NeuroLibre preprint of this manuscript is available at https://preprint.neurolibre.org/10.55458/neurolibre.00014/. All imaging data submitted to the challenge, dataset details, registered ROI maps, and processed T1 maps are hosted on OSF https://osf.io/ywc9g/. The dataset submissions and quality assurance were handled through GitHub issues in this repository https://github.com/rrsg2020/data_submission (commit: 9d7eff1). Note that accepted submissions are closed issues, and that the GitHub branches associated with the issue numbers contain the Dockerfile and Jupyter Notebook scripts that reproduce these preliminary quality assurance results and can be run in a browser using MyBinder. The ROI registration scripts for the phantoms and T1 fitting pipeline to process all datasets are hosted in this GitHub repository, https://github.com/rrsg2020/t1_fitting_pipeline (commit: 3497a4e). All the analyses of the datasets were done using Jupyter Notebooks and are available in this repository, https://github.com/rrsg2020/analysis (commit: 8d38644), which also contains a Dockerfile to reproduce the environment using a tool like MyBinder. A dashboard was developed to explore the datasets information and results in a browser, which is accessible here, https://rrsg2020.dashboards.neurolibre.org, and the code is also available on GitHub: https://github.com/rrsg2020/rrsg2020-dashboard (commit: 6ee9321).

References

  1. Keenan KE, Biller JR, Delfino JG, Boss MA, Does MD, Evelhoch JL, et al. Recommendations towards standards for quantitative MRI (qMRI) and outstanding needs. J Magn Reson Imaging. 2019;49: e26–e39.

  2. Fryback DG, Thornbury JR. The efficacy of diagnostic imaging. Med Decis Making. 1991;11: 88–94.

  3. Schweitzer M. Stages of technical efficacy: Journal of Magnetic Resonance Imaging style. J Magn Reson Imaging. 2016;44: 781–782.

  4. Seiberlich N, Gulani V, Campbell A, Sourbron S, Doneva MI, Calamante F, et al. Quantitative Magnetic Resonance Imaging. Academic Press; 2020.

  5. Damadian R. Tumor detection by nuclear magnetic resonance. Science. 1971;171: 1151–1153.

  6. Pykett IL, Mansfield P. A line scan image study of a tumorous rat leg by NMR. Phys Med Biol. 1978;23: 961–967.

  7. Stikov N, Boudreau M, Levesque IR, Tardif CL, Barral JK, Pike GB. On the accuracy of T1 mapping: Searching for common ground. Magn Reson Med. 2015;73: 514–522.

  8. Boudreau M, Keenan KE, Stikov N. Quantitative T1 and T1ρ Mapping. Quantitative Magnetic Resonance Imaging. 2020. pp. 19–45.

  9. Bottomley PA, Foster TH, Argersinger RE, Pfeifer LM. A review of normal tissue hydrogen NMR relaxation times and relaxation mechanisms from 1-100 MHz: dependence on tissue type, NMR frequency, temperature, species, excision, and age. Med Phys. 1984;11: 425–448.

  10. Wansapura JP, Holland SK, Dunn RS, Ball WS Jr. NMR relaxation times in the human brain at 3.0 tesla. J Magn Reson Imaging. 1999;9: 531–538.
    
  11. Dieringer MA, Deimling M, Santoro D, Wuerfel J, Madai VI, Sobesky J, et al. Rapid parametric mapping of the longitudinal relaxation time T1 using two-dimensional variable flip angle magnetic resonance imaging at 1.5 Tesla, 3 Tesla, and 7 Tesla. PLoS One. 2014;9: e91318.
    
  12. Ernst RR, Anderson WA. Application of Fourier Transform Spectroscopy to Magnetic Resonance. Rev Sci Instrum. 1966;37: 93–102.
    
  13. Redpath TW, Smith FW. Technical note: use of a double inversion recovery pulse sequence to image selectively grey or white brain matter. Br J Radiol. 1994;67: 1258–1263.
    
  14. Tofts PS. Modeling tracer kinetics in dynamic Gd-DTPA MR imaging. J Magn Reson Imaging. 1997;7: 91–101.
    
  15. Sled JG, Pike GB. Quantitative imaging of magnetization transfer exchange and relaxation properties in vivo using MRI. Magn Reson Med. 2001;46: 923–931.
    
  16. Yuan J, Chow SKK, Yeung DKW, Ahuja AT, King AD. Quantitative evaluation of dual-flip-angle T1 mapping on DCE-MRI kinetic parameter estimation in head and neck. Quant Imaging Med Surg. 2012;2: 245–253.
    
  17. Drain LE. A Direct Method of Measuring Nuclear Spin-Lattice Relaxation Times. Proc Phys Soc A. 1949;62: 301.
    
  18. Hahn EL. An Accurate Nuclear Magnetic Resonance Method for Measuring Spin-Lattice Relaxation Times. Physical Review. 1949. pp. 145–146. doi:10.1103/physrev.76.145
    
  19. Fram EK, Herfkens RJ, Johnson GA, Glover GH, Karis JP, Shimakawa A, et al. Rapid calculation of T1 using variable flip angle gradient refocused imaging. Magn Reson Imaging. 1987;5: 201–208.
    
  20. Deoni SCL, Rutt BK, Peters TM. Rapid combined T1 and T2 mapping using gradient recalled acquisition in the steady state. Magnetic Resonance in Medicine. 2003. pp. 515–526. doi:10.1002/mrm.10407
    
  21. Cheng H-LM, Wright GA. Rapid high-resolution T1 mapping by variable flip angles: Accurate and precise measurements in the presence of radiofrequency field inhomogeneity. Magnetic Resonance in Medicine. 2006. pp. 566–574. doi:10.1002/mrm.20791
    
  22. Look DC, Locker DR. Time saving in measurement of NMR and EPR relaxation times. Rev Sci Instrum. 1970;41: 250–251.
    
  23. Messroghli DR, Radjenovic A, Kozerke S, Higgins DM, Sivananthan MU, Ridgway JP. Modified Look-Locker inversion recovery (MOLLI) for high-resolution T1 mapping of the heart. Magn Reson Med. 2004;52: 141–146.
    
  24. Piechnik SK, Ferreira VM, Dall’Armellina E, Cochlin LE, Greiser A, Neubauer S, et al. Shortened Modified Look-Locker Inversion recovery (ShMOLLI) for clinical myocardial T1-mapping at 1.5 and 3 T within a 9 heartbeat breathhold. J Cardiovasc Magn Reson. 2010;12: 69.
    
  25. Marques JP, Kober T, Krueger G, van der Zwaag W, Van de Moortele P-F, Gruetter R. MP2RAGE, a self bias-field corrected sequence for improved segmentation and T1-mapping at high field. NeuroImage. 2010. pp. 1271–1281. doi:10.1016/j.neuroimage.2009.10.002
    
  26. Marques JP, Gruetter R. New developments and applications of the MP2RAGE sequence--focusing the contrast and high spatial resolution R1 mapping. PLoS One. 2013;8: e69294.
    
  27. Keenan KE, Ainslie M, Barker AJ, Boss MA, Cecil KM, Charles C, et al. Quantitative magnetic resonance imaging phantoms: A review and the need for a system phantom. Magn Reson Med. 2018;79: 48–61.
    
  28. Stupic KF, Ainslie M, Boss MA, Charles C, Dienstfrey AM, Evelhoch JL, et al. A standard system phantom for magnetic resonance imaging. Magn Reson Med. 2021;86: 1194–1211.
    
  29. Bane O, Hectors SJ, Wagner M, Arlinghaus LL, Aryal MP, Cao Y, et al. Accuracy, repeatability, and interplatform reproducibility of T1 quantification methods used for DCE-MRI: Results from a multicenter phantom study. Magn Reson Med. 2018;79: 2564–2575.
    
  30. Keenan KE, Gimbutas Z, Dienstfrey A, Stupic KF, Boss MA, Russek SE, et al. Multi-site, multi-platform comparison of MRI T1 measurement using the system phantom. PLoS One. 2021;16: e0252966.
    
  31. Barral JK, Gudmundson E, Stikov N, Etezadi-Amoli M, Stoica P, Nishimura DG. A robust methodology for in vivo T1 mapping. Magn Reson Med. 2010;64: 1057–1067.
    
  32. Kluyver T, Ragan-Kelley B, Granger B, Bussonnier M, Frederic J, Kelley K, et al. Jupyter Notebooks – a publishing format for reproducible computational workflows. Positioning and Power in Academic Publishing: Players, Agents and Agendas. Amsterdam, NY: IOS Press; 2016. pp. 87–90.
    
  33. Beg, Taka, Kluyver, Konovalov, Ragan-Kelley, Thiery, et al. Using Jupyter for Reproducible Scientific Workflows. Comput Sci Eng. 2021;23: 36–46.
    
  34. Karakuzu A, Boudreau M, Duval T, Boshkovski T, Leppert I, Cabana J-F, et al. qMRLab: Quantitative MRI analysis, under one umbrella. J Open Source Softw. 2020;5: 2343.
    
  35. Cabana J-F, Gu Y, Boudreau M, Levesque IR, Atchia Y, Sled JG, et al. Quantitative magnetization transfer imaging made easy with qMTLab: Software for data simulation, analysis, and visualization. Concepts Magn Reson Part A Bridg Educ Res. 2015;44A: 263–277.
    
  36. Avants BB, Tustison N, Song G. Advanced normalization tools (ANTS). Insight J. 2009;2: 1–35.
    
  37. McCarthy P. FSLeyes. 2019. doi:10.5281/zenodo.3403671
    
  38. Merkel D. Docker: Lightweight Linux containers for consistent development and deployment. 2014 [cited 14 Feb 2023]. Available: https://www.seltzer.com/margo/teaching/CS508.19/papers/merkel14.pdf
    
  39. Boettiger C. An introduction to Docker for reproducible research. Oper Syst Rev. 2015;49: 71–79.
    
  40. Project Jupyter, Bussonnier M, Forde J, Freeman J, Granger B, Head T, et al. Binder 2.0 - Reproducible, interactive, sharable environments for science at scale. Proceedings of the Python in Science Conference. SciPy; 2018. doi:10.25080/majora-4af1f417-011
    
  41. Gracien, Maiworm, Brüche, Shrestha. How stable is quantitative MRI? – Assessment of intra- and inter-scanner-model reproducibility using identical acquisition sequences and data analysis …. Neuroimage. 2020. Available: https://www.sciencedirect.com/science/article/pii/S1053811919309553
    
  42. Layton KJ, Kroboth S, Jia F, Littin S, Yu H, Leupold J, et al. Pulseq: A rapid and hardware-independent pulse sequence prototyping framework. Magn Reson Med. 2017;77: 1544–1552.
    
  43. Cordes C, Konstandin S, Porter D, Günther M. Portable and platform-independent MR pulse sequence programs. Magn Reson Med. 2020;83: 1277–1290.
    
  44. Karakuzu A, Biswas L, Cohen-Adad J, Stikov N. Vendor-neutral sequences and fully transparent workflows improve inter-vendor reproducibility of quantitative MRI. Magn Reson Med. 2022;88: 1212–1228.
    
  45. A G Teixeira RP, Neji R, Wood TC, Baburamani AA, Malik SJ, Hajnal JV. Controlled saturation magnetization transfer for reproducible multivendor variable flip angle T1 and T2 mapping. Magn Reson Med. 2020;84: 221–236.
    
  46. Lee Y, Callaghan MF, Acosta-Cabronero J, Lutti A, Nagy Z. Establishing intra- and inter-vendor reproducibility of T1 relaxation time measurements with 3T MRI. Magn Reson Med. 2019;81: 454–465.
    
  47. Mancini M, Karakuzu A, Cohen-Adad J, Cercignani M, Nichols TE, Stikov N. An interactive meta-analysis of MRI biomarkers of myelin. Elife. 2020;9. doi:10.7554/eLife.61523
    
  48. Lazari A, Lipp I. Can MRI measure myelin? Systematic review, qualitative assessment, and meta-analysis of studies validating microstructural imaging with myelin histology. Neuroimage. 2021;230: 117744.